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ABSTRACT 

The complexity and accuracy of current and future "precision cosmology" observational campaigns has made 
it essential to develop an efficient technique for directly combining simulation and observational datasets to deter- 
mine cosmological and model parameters; a procedure we term calibration. Once a satisfactory calibration of the 
underlying cosmological model is achieved, independent predictions for new observations become possible. For 
this procedure to be effective, robust characterization of the uncertainty in the calibration process is highly desir- 
able. In this Letter, we describe a statistical methodology which can achieve both of these goals. An application 
example based around dark matter structure formation simulations and a synthetic mass power spectrum dataset 
is used to demonstrate the approach. 

Subject headings: Cosmology: cosmological parameters — cosmology: theory 
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1. INTRODUCTION 

It is widely recognized that, beginning in the last decade, 
a transition to an era of "precision cosmology" is well under- 
way. Ongoing and upcoming surveys such as the Wilkinson 
Microwave Anisotropy Probe (WMAP, Spergel et al. 2006), 
the Sloan Digital Sky Survey (SDSS, Adelman-McCarthy et al. 
2006), Planck, the Dark Energy Survey (DES), the Joint Dark 
Energy Mission (JDEM), the Large Synoptic Survey Telescope 
(LSST), and Pan-STARRS constitute superb sources of cosmo- 
logical statistics. These sources include (galaxy, cluster, and 
mass) power spectra and cluster mass functions, from which 
roughly 25 cosmological parameters have to be constrained 
(see, e.g., Spergel et al. 2006, Tegmark et al. 2003, Abaza- 
jian et al. 2005). The promised accuracy from future observa- 
tions is remarkable, as some parameters can be measured at the 
1% level or better, posing a major challenge to cosmological 
theory. Predictions and analysis methods must at least match 
- and preferably substantially exceed - the observational ac- 
curacy. For many observables, this can only be achieved by 
simulations incorporating physical effects beyond the reach of 
analytic modeling. 

Cosmological simulations already play a key role in the de- 
sign and interpretation of observations. Controlling systemat- 
ics is a necessary first step, followed by combining simulations 
with observations to extract cosmological and model parame- 
ters. This cannot be accomplished by brute force. For exam- 
ple, if every parameter is sampled only ten times in a twenty- 
dimensional parameter space, it would require lO^*' large-scale 
simulations, which is currently - and in the near-term - quite 
impossible. Even the variation of only a subset of the param- 
eters over a sufficient range is infeasible. The need to develop 
and employ reliable statistical methods to determine and con- 
strain parameters robustly is therefore manifest. 

In this Letter we describe a statistical framework to de- 
termine cosmological and model parameters and associated 
uncertainties from simulations and observational data (for an 
overview of the basic ideas see, e.g., Kennedy & O'Hagan 
2001 and Goldstein & Rougier 2004). The framework inte- 



grates a set of interlocking procedures: (i) simulation design - 
the determination of the parameter settings at which to carry 
out the simulations; (ii) emulation - given simulation output 
at the input parameter settings, how to estimate the output at 
new, untried settings; (iii) uncertainty and sensitivity analysis 
- determining the variations in simulation output due to un- 
certainty or changes in the input parameters; (iv) calibration - 
combining observations (with known errors) and simulations to 
estimate parameter values consistent with the observations, in- 
cluding the associated uncertainty; (v) prediction - using the 
calibrated simulator to predict new cosmological results with a 
set of uncertainty bounds. 

For concreteness, we discuss the framework methodology in 
terms of a simple example application: Estimation of five pa- 
rameters from dark matter structure formation simulations and 
a synthetic set of "WMAP -i- SDSS" measurements of the mat- 
ter power spectrum. A detailed description will be provided 
elsewhere (S. Habib et al. in preparation). 

2. THE STATISTICAL FRAMEWORK 

We employ a Bayesian framework to update prior probabihty 
distributions on cosmological parameters given observational 
data. Denoting these parameters collectively by 9 and the ob- 
served power spectrum data by a vector jobs, we model the data 
as: 

3'obs = ?7(6') + e, (1) 

where ri(9) denotes the simulation output at input setting 6, and 
e ^ A^(0, Sj) where S, describes the error structure of the ob- 
servations and any potential systematic differences between the 
simulated and observed data. 

Standard Bayesian estimation (Jeffreys 1961) proceeds using 
the likelihood 

L(yobs|e)oc |S],,r'/'exp{-i[3;obs-??(^)]^S;'[}'obs-r?(e)]}, (2) 

and a prior Tr{9) to form the posterior distribution on 9: 

7r(0|yobs)ocL(3;obs|^)^(e). (3) 
Because the resulting posterior distributions are not in any eas- 
ily recognized closed form, they must be explored numerically, 



2 



Cosmic Calibration 



usually using Markov Chain Monte Carlo (MCMC) techniques 
(Besag et al. 1995). This procedure requires running the (po- 
tentially very expensive) simulation codes many thousands of 
times as the 6'-space is explored. However, only a limited num- 
ber (~ 100? ^ 1000?) of runs may be feasible. Therefore, effi- 
ciently combining Bayesian methods with simulations requires 
a representation of the code output (emulator) that can be sam- 
pled many thousands of times during the course of the MCMC 
in lieu of running the actual code. When queried at an input 
setting where a code run is available, the emulator should re- 
produce the output of the code. At other input settings, the em- 
ulator effectively interpolates nearby code runs while including 
uncertainty due to the lack of complete knowledge of the code 
output. The selection of input settings in the simulation de- 
sign must be sufficiently dense that the emulator can accurately 
mimic the code output, and also be sufficiently sparse that the 
simulation campaign is computationally tractable. 

Systematic design of simulation procedures is reviewed in 
Santner et al. (2003). We use orthogonal array-based Latin 
hypercube sampling (Tang 1993) to fix 128 input settings over 
the five parameters. This approach takes an orthogonal array 
design - which ensures that all lower-dimensional projections 
have desirable space-filling properties - and modifies it so that 
it is also a Latin hypercube, the most efficient stratified sam- 
pling strategy. 

The code output for the /th input setting is a power spectrum, 
y^'\k) = rjifii), viewed as a column vector over the nk points in 
k space. Each of the resulting = 128 output spectra is loaded 
into a single «t x matrix: y^itn?. = [y'ly^'l • • • b*"'^]- This ma- 
trix is then subjected to a singular value decomposition (SVD) 
to find an efficient empirical orthogonal representation of the 
simulation outputs: [jsimsl/j = [USV'^lj = YTp=\ \Aap]iWp{6j) 
where the a^'s are x 1 orthogonal basis vectors in the sim- 
ulation output space (columns of U), the A,'s are the singular 
values of the simulation matrix, and each principal component 
(PC) weight Wp{9) is a 1 x n., row vector in the parameter space 
(columns of V). Usually the first few singular values dominate 
the remainder, allowing us to keep only a few of the principal 
components in the analysis. In what follows, we have kept five 
PC's. 

The SVD gives the PC weights at the design input settings 
(01 , 02, • ■ ■ J ^«,)- However, in the course of the MCMC, we need 
the PC weights at intermediate input settings. We construct the 
emulator by putting a spatial Gaussian Process (GP) model on 
each PC weight (Sacks et al. 1989, MacKay 1998), a nonhn- 
ear interpolation scheme that works directly on the space of 
functions. This allows the emulator to smoothly interpolate the 
predicted code output between the design settings, giving an ef- 
ficient probabilistic representation of the prediction uncertainty. 
The spatial parameters controlling the GP on each component 
weight are estimated in the course of the MCMC, thereby con- 
structing the emulator as needed during the calibration analysis. 
Details of the procedures used in our code are being reported 
elsewhere in the literature. For recent examples of these tech- 
niques used in practice, see Higdon et al. (2004). 

3. PARAMETER ESTIMATION AND THE NONLINEAR MATTER 
POWER SPECTRUM 

In order to give an explicit demonstration of the approach, 
we first generate a synthetic observational dataset from simu- 
lations. The key advantage of doing this is that the underly- 
ing set of cosmological parameters are known, allowing a di- 
rect test of the statistical procedure. To generate the "observa- 



tions" we begin with a smooth power spectrum computed by 
running ten realizations of the same cosmology with the paral- 
lel panicle mesh (PM) code MC^ (Cf. Heitmann et al. 2005 
for code information and comparison results) and averaging 
over the results. The initial conditions are set using CMB- 
FAST (Seljak & Zaldarriaga 1996). We restrict our study to 
the linear and quasi-linear regime relevant to large-scale struc- 
ture surveys; the force resolution of the PM-code accurately 
resolves the scales of interest. 

We form a single power spectrum by attaching the linear 
Piik), 0.001/iMpc-i <k< O.l/zMpc-^ (growth specified by lin- 
ear theory), to the power spectrum from simulations P^ik) at 
k = 0.1/zMpc"'. Next, 28 points from this combined power 
spectrum are picked, spaced roughly in the same bins as in 
a real dataset. The error bars are set by values typical for 
cosmic microwave background (CMB) experiments such as 
WMAP (Spergel et al. 2006) in the low-fc range, transitioning 
to values typical of surveys such as SDSS (Adelman-McCarthy 
et al. 2006) at higher k. Finally, points are moved off the base 
power spectrum according to a Gaussian distribution with a 1- 
sigma confidence, as shown in Figure 1 . Note that for this test 
demonstration we are assuming that galaxy bias has already 
been incorporated in the measurement. In a more realistic situ- 
ation, the bias would be included as part of the modeling pro- 
cess. Note also that the choice of a homogeneous observational 
dataset here is merely for convenience. For a heterogeneous 
dataset such as CMB C/'s combined with P{k) measured from 
the galaxy distribution, yobs [Cf. Eqn. (1)] would also contain 
the CMB results and the simulations underlying rj{d) would in- 
clude runs of (say) CMBFAST. 

We consider the following five cosmological parameters: 
6 = (n,/!,crg, ilcDM, f^b)- We assume a flat ACDM universe with 
6 = (0.99,0.71,0.84,0.27,0.044) to make the synthetic obser- 
vations (black line in Figure 1). To determine the simulation 
design we must fix the range that the input parameters should 
be varied over To this end, we assume independent, flat priors 
over the ranges: 0.8 < « < 1.4, 0.5 < /i < 1.1, 0.6 < erg < 1-6, 
0.05 < f^cDM < 0.6, and 0.02 < Ob < 0.12. The simulation de- 
sign prescribes a set of 128 input settings. This number of simu- 
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Fig. 1. — Subset of the 128 .simulated power spectra and the synthetic 
dataset. The black line is the spectrum from which the synthetic data were 
derived. 
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Fig. 2. — Posterior density for the parameter vector 9. The diagonal gives 
estimates of the univariate marginal pdfs for each component; blue: results 
from the entire synthetic dataset; green using only the linear regime (k < 
O.lAMpc"')- Off-diagonal images show estimates of the bivariate marginal 
pdfs: upper triangle for the entire dataset, lower triangle for the linear regime. 
The lines give estimates of the 90% highest posterior density region. Again, 
blue is from using the entire dataset; green from using only the linear regime. 
The dots show the actual parameter values used to generate the synthetic ob- 
servations. 

lations, as we show below, yields an emulator with performance 
at the few percent level, sufficient for our present purposes. 

Each run is carried out with 128^ particles on a 512^ grid 
for a 450/i~'Mpc box, guaranteeing sufficient force resolution 
for the scales of interest. To limit systematic biases, different 
seeds are used to generate the initial Gaussian random field for 
each simulation. Cosmic variance is minimized by matching 
the numerical to the linear power spectrum in the linear regime 
near the ACDM power spectrum peak. We show a subset of 
the 128 power spectra in Figure 1. The emulator is now built 
as described above - note that the emulator is called only as 
needed by the MCMC analysis in the calibration process. 

4. RESULTS 

The posterior distribution of the five cosmological parame- 
ters is depicted in Figure 2. The diagonal displays the uni- 
variate, marginal pdfs for each of the parameters, while the 
off-diagonal plots show estimated 2-d marginal densities, along 
with 90% probability contours. For comparison. Table 1 gives 
the mean value of the parameters along with the estimated un- 
certainty, as well as the "true" value for each parameter These 
posterior estimates are obtained under two separate formula- 
tions - one which uses all of the synthetic observation data, and 
one which uses only the observations in the linear regime for 
which k < 0.1/;Mpc~'. The green pdfs and contours in Figure 2 
show the posterior results including information only from the 
linear regime, whereas the blue pdfs and the contours result 
from an analysis of the full nonlinear power spectrum. Because 
of the limited observational dynamic range, using only the lin- 
ear regime results in systematic shifts from the "true" answers, 
albeit within the quoted uncertainties. Overall, we find f^cDM 
and CTg to be very well determined. The full nonlinear analysis 
over the entire ^-range significantly improves the accuracy for 
(Tg and r^cDM as well as the precision of the constraint for ag 



(see Table 1). While the synthetic dataset provides information 
about the remaining three parameters, n, h, and ilb, they are not 
as well constrained as is to be expected from an analysis re- 
stricted to the matter power spectrum only. Note that the linear 
analysis underestimates the uncertainty in n. 

The posterior distribution describes the uncertainty regarding 
the parameter vector 6 as well as statistical variance and corre- 
lation parameters that control the response surface model. Once 
these posterior samples have been produced, it is straightfor- 
ward to generate posterior realizations of the emulator to assess 
its adequacy in modeling the simulated output. The accuracy 
of the emulator was estimated by excluding individual simula- 
tion runs and building a new emulator based on the remaining 
127 power spectra. The emulator predictions can now be com- 
pared against the actual simulation output of the excluded run. 
Three examples of applying this procedure are shown in the left 
plot in Figure 3. The accuracy of the emulator turns out to be 
extremely good, at the level of a few percent, which is very 
adequate for the present analysis. The right panel in Figure 3 
summarizes the residuals for all 128 simulations - the central 
gray band delineates the middle 50% of the residuals; the light 
gray band delineates the middle 90%. Gaussian process models 
offer a number of advantages over other methods for modeling 
simulation output; they do not require runs over a grid of in- 
put settings; they allow for interpolation of the simulation out- 
put; they can accommodate fairly general interactions between 
input parameters; and typically outperform other modeling ap- 
proaches. For example, the GP model gives substantially better 
predictions as compared to a quadratic response surface model, 
a generalized additive model (GAM), or a multivariate additive 
regression spline model (MARS) (Hastie et al. 2001). 

The fitted emulator can be used to explore the sensitivity of 
the simulation output to changes in the cosmological parame- 
ters. Figure 4 shows how the log of the power spectrum changes 
as one parameter is varied, the others being fixed at their prior 
midpoints. Both o-g and 51cdm have a large impact when varied 
over their prior ranges. Hence it is not surprising that the pos- 
terior distribution for these two parameters are the most con- 
strained by the observed data. Figure 4 also suggests that while 
most parameters affect the power spectrum in the linear regime 
(k < O.l/iMpc"'), only cg affects the power spectrum in the 
nonlinear regime (k > O.l/iMpc"'). Thus, while additional data 
in the nonlinear regime is likely to help constrain erg, it will not 
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Fig. 3. — Evaluation of the emulator fit. Left: Three simulations (black 
dots) and the corresponding response surface fits (green lines) obtained after 
holding out the simulation to be predicted and training the response surface on 
the remaining 127 simulations. Right: Residual (simulation log P- response 
surface) from holdout predictions (i.e. the simulation being predicted is not 
used to estimate the response surface). The central gray region contains the 
middle 50% of the residuals; the wider light gray region, the middle 90%. 
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greatly reduce uncertainty in the other four parameters. 

5. CONCLUSION AND OUTLOOK 

We have introduced a new, very powerful method for deter- 
mining cosmological and model parameters from simulations 
and observations. The key idea is to extract maximum utility 
from a necessarily finite set of expensive simulations. The im- 
plementation of this idea includes several valuable features: (i) 
a design to optimally sample the simulation parameter space; 
(ii) an accurate emulator capable of generating the required 
outputs in between the sampled simulation points; (iii) an un- 
certainty and sensitivity analysis; (iv) the parameter constraints 
themselves, with associated uncertainty bounds. 

In order to demonstrate the basic approach, we used a set 
of 128 dark matter structure formation simulations and a ho- 
mogeneous synthetic "observational" dataset to determine five 
cosmological parameters. The next step is to use the framework 
for analyses of real data, especially of combined datasets such 
as the CMB and large scale structure observations. 

There are many ways to enhance the method and improve 
its performance. One is the melding of information from codes 
with different degrees of resolution and input physics, such as in 
the extraction of information about the mass distribution from 
the Lyman-a forest. Here, complex hydrodynamics simulations 
are certainly desirable, but much faster approximate methods 
such as hydro-particle mesh (HPM) are available. Thus, a first 
analysis based on HPM can be performed, narrowing the pa- 
rameter range of interest sufficiently to make hydro runs feasi- 
ble. Interesting offshoots of the methodology include the ex- 
ploitation of certain intermediate results. For instance, a large 
set of N-body simulations can be performed with several input 
parameters such as the equation of state for dark energy. An 
emulator can then be constructed from these and publicly re- 
leased. This emulator can then be conveniently used instead of 



real simulations for planning observations and data analysis. 
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Table 1 
Parameter Constraints 



Param. Mean""""" Mean"" True Value 

"n QS9l^^ 0.940!;;^^^ 099 

h 0.786!;;i23 0.765!°^^ 0.71 

ag 0.882!»[1^2 o.962!!]|2i 0.84 

ricDM 0.287!0138 0.343!015« 0.27 



'CDM ^-^C'-Oias ^•-'^■'-0 130 
+0.052 A nc/i+0.052 
-0.034 ^-^-^^-0.031 



rib 0.057!0"52 0.054!0!'52 0.044 



Note. — Mean value for the full and Unear {k < 0.1/iMpc ' ) datasets with 
their 90% intervals, and the true value for the five parameters under investigation. 



